Panako - A Scalable Acoustic Fingerprinting System Handling Time-Scale and Pitch Modification

Authors

Joren Six

Marc Leman

Abstract

This paper presents a scalable granular acoustic fingerprinting system. An acoustic fingerprinting system uses condensed representations of audio signals, called acoustic fingerprints, to identify short audio fragments in large audio databases. A robust fingerprinting system generates similar fingerprints for perceptually similar audio signals. The system presented here is designed to handle time-scale and pitch modifications. The open source implementation of the system is called Panako and is evaluated on commodity hardware using a freely available reference database with fingerprints of over 30,000 songs. The results show that the system responds quickly and reliably to queries, while handling time-scale and pitch modifications of up to ten percent. The system is also shown to handle GSM compression, several audio effects and band-pass filtering. After a query, the system returns the start time in the reference audio and how much the query has been pitch-shifted or time-stretched with respect to the reference audio. The design of the system that offers this combination of features is the main contribution of this paper.
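The abstract does not say how the start time, time-stretch and pitch-shift values are derived. As a rough illustration only, and not Panako's actual algorithm, the sketch below assumes a query has already been matched to a reference as pairs of (time, frequency) points. The function name `estimate_modification` and the input layout are hypothetical. The time-stretch factor is taken as the median ratio of time gaps between consecutive points, and the pitch-shift factor as the median ratio of matched frequencies.

```python
from statistics import median


def estimate_modification(ref_points, query_points):
    """Estimate (time_factor, pitch_factor, ref_start) from matched points.

    Hypothetical input: two equally long lists of (time_s, frequency_hz)
    tuples, matched pairwise and sorted by time. This is a toy model,
    not the method described in the paper.
    """
    if len(ref_points) != len(query_points) or len(ref_points) < 2:
        raise ValueError("need at least two matched point pairs")

    # Ratio of consecutive time gaps: > 1 means the query plays slower.
    time_ratios = []
    for i in range(1, len(ref_points)):
        ref_gap = ref_points[i][0] - ref_points[i - 1][0]
        query_gap = query_points[i][0] - query_points[i - 1][0]
        if ref_gap > 0:
            time_ratios.append(query_gap / ref_gap)

    # Ratio of matched frequencies: > 1 means the query is pitched up.
    pitch_ratios = [q[1] / r[1]
                    for r, q in zip(ref_points, query_points) if r[1] > 0]

    if not time_ratios or not pitch_ratios:
        raise ValueError("points do not allow a ratio estimate")

    time_factor = median(time_ratios)
    pitch_factor = median(pitch_ratios)

    # Map the query's time zero back onto the reference timeline.
    ref_start = ref_points[0][0] - query_points[0][0] / time_factor
    return time_factor, pitch_factor, ref_start


if __name__ == "__main__":
    ref = [(10.0, 440.0), (11.0, 880.0), (13.0, 660.0)]
    # Same events played 10% slower and pitched up by 5%.
    query = [(0.0, 462.0), (1.1, 924.0), (3.3, 693.0)]
    print(estimate_modification(ref, query))
```

The median is used here so that a few spurious matches do not dominate the estimate; a real system would need a far more robust matching and voting step.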


Similar resources

A Framework to Provide Fine-Grained Time-Dependent Context for Active Listening Experiences

[1] Joren Six and Marc Leman. Panako - A Scalable Acoustic Fingerprinting System Handling Time-Scale and Pitch Modification. In Proceedings of the 15th ISMIR Conference (ISMIR 2014). [2] Joren Six, Olmo Cornelis, and Marc Leman. TarsosDSP, a Real-Time Audio Processing Framework in Java. In Proceedings of the 53rd AES Conference (AES53rd), 2014. [3] Avery L. Wang. An Industrial-Strength Audio Search ...


A local fingerprinting approach for audio copy detection

This study proposes an audio copy detection system that is robust to various attacks. These include the severe pitch shift and tempo change attacks which existing systems fail to detect. First, we propose a novel two-dimensional representation for audio signals called the time-chroma image. This image is based on a modification of the concept of chroma in the music literature and is shown to ac...


Source-filter models for time-scale pitch-scale modification of speech

This paper presents two time-scale pitch-scale modification techniques to be used in speech synthesis systems. They have been applied to Microsoft’s Whistler system, which is based on concatenative synthesis. Both methods are based on a source-filter model, one of them using LPC parameters and the other one using cepstral parameters. The proposed methods achieve high quality prosody modification...


A mixed-excitation frequency domain model for time-scale pitch-scale modification of speech

This paper presents a time-scale pitch-scale modification technique for concatenative speech synthesis. The method is based on a frequency domain source-filter model, where the source is modeled as a mixed excitation. This model is highly coupled with a compression scheme that results in compact acoustic inventories. When compared to the approach in the Whistler system using no mixed excitation,...


SIFT-based local spectrogram image descriptor: a novel feature for robust music identification

Music identification via audio fingerprinting has been an active research field in recent years. In real-world environments, music queries are often deformed by various interferences, which typically include signal distortions and time-frequency misalignments caused by time stretching, pitch shifting, etc. Therefore, robustness plays a crucial role in music identification techniques. In this p...


Publication date: 2014
